AITopics | user study

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
Asia > Middle East > Jordan (0.04)
(4 more...)

Genre: Research Report > Experimental Study (0.46)

Industry:

Education (1.00)
Transportation (0.69)
Health & Medicine > Therapeutic Area (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing SystemsFeb-11-2026, 16:31:44 GMT

Appendix for EmbodiedGPT: Vision-Language Pre-Training via Embodied Chain of Thought Y ao Mu

This involves introducing specific prompts such as "in

artificial intelligence, embodiedgpt, machine learning, (14 more...)

Country:

Asia > China > Hong Kong (0.05)
Asia > China > Shanghai > Shanghai (0.05)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing SystemsDec-27-2025, 19:08:43 GMT

075b051ec3d22dac7b33f788da631fd4-Supplemental.pdf

artificial intelligence, attribution, machine learning, (14 more...)

Country:

North America > Canada > Newfoundland and Labrador > Newfoundland (0.05)
North America > United States > Texas > Loving County (0.04)
North America > United States > North Dakota > Billings County (0.04)

Genre:

Questionnaire & Opinion Survey (1.00)
Research Report > New Finding (0.48)
Research Report > Experimental Study (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Neural Information Processing SystemsDec-27-2025, 17:46:42 GMT

AVIS: Autonomous Visual Information Seeking with Large Language Model Agent

In this paper, we propose an autonomous information seeking visual question answering framework, A VIS. Our method leverages a Large Language Model (LLM)

information, large language model, machine learning, (17 more...)

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
(3 more...)

Genre: Workflow (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Neural Information Processing SystemsDec-23-2025, 18:13:38 GMT

Use-Case-Grounded Simulations for Explanation Evaluation

A growing body of research runs human subject evaluations to study whether providing users with explanations of machine learning models can help them with practical real-world use cases. However, running user studies is challenging and costly, and consequently each study typically only evaluates a limited number of different settings, e.g., studies often only evaluate a few arbitrarily selected model explanation methods. To address these challenges and aid user study design, we introduce Simulated Evaluations (SimEvals). SimEvals involve training algorithmic agents that take as input the information content (such as model explanations) that would be presented to the user, to predict answers to the use case of interest. The algorithmic agent's test set accuracy provides a measure of the predictiveness of the information content for the downstream use case. We run a comprehensive evaluation on three real-world use cases (forward simulation, model debugging, and counterfactual reasoning) to demonstrate that SimEvals can effectively identify which explanation methods will help humans for each use case. These results provide evidence that \simevals{} can be used to efficiently screen an important set of user study design decisions, e.g., selecting which explanations should be presented to the user, before running a potentially costly user study.

artificial intelligence, machine learning, proceedings, (12 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.77)

Borghoff, Uwe M., Minas, Mark, Schopp, Jannis

Learning to Code with Context: A Study-Based Approach

arXiv.org Artificial IntelligenceDec-8-2025

The rapid emergence of generative AI tools is transforming the way software is developed. Consequently, software engineering education must adapt to ensure that students not only learn traditional development methods but also understand how to meaningfully and responsibly use these new technologies. In particular, project-based courses offer an effective environment to explore and evaluate the integration of AI assistance into real-world development practices. This paper presents our approach and a user study conducted within a university programming project in which students collaboratively developed computer games. The study investigates how participants used generative AI tools throughout different phases of the software development process, identifies the types of tasks where such tools were most effective, and analyzes the challenges students encountered. Building on these insights, we further examine a repository-aware, locally deployed large language model (LLM) assistant designed to provide project-contextualized support. The system employs Retrieval-Augmented Generation (RAG) to ground responses in relevant documentation and source code, enabling qualitative analysis of model behavior, parameter sensitivity, and common failure modes. The findings deepen our understanding of context-aware AI support in educational software projects and inform future integration of AI-based assistance into software engineering curricula.

large language model, machine learning, natural language, (19 more...)

2512.05242

Country:

North America > United States > Florida > Miami-Dade County > Miami (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(9 more...)

Genre:

Research Report > New Finding (1.00)
Instructional Material > Course Syllabus & Notes (1.00)
Overview (0.92)
(2 more...)

Industry:

Education > Curriculum > Subject-Specific Education (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.69)

Pach, Mateusz, Karthik, Shyamgopal, Bouniot, Quentin, Belongie, Serge, Akata, Zeynep

Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models

arXiv.org Artificial IntelligenceDec-1-2025

Sparse Autoencoders (SAEs) have recently gained attention as a means to improve the interpretability and steerability of Large Language Models (LLMs), both of which are essential for AI safety. In this work, we extend the application of SAEs to Vision-Language Models (VLMs), such as CLIP, and introduce a comprehensive framework for evaluating monosemanticity at the neuron-level in visual representations. To ensure that our evaluation aligns with human perception, we propose a benchmark derived from a large-scale user study. Our experimental results reveal that SAEs trained on VLMs significantly enhance the monosemanticity of individual neurons, with sparsity and wide latents being the most influential factors. Further, we demonstrate that applying SAE interventions on CLIP's vision encoder directly steers multimodal LLM outputs (e.g., LLaVA), without any modifications to the underlying language model. These findings emphasize the practicality and efficacy of SAEs as an unsupervised tool for enhancing both interpretability and control of VLMs. Code and benchmark data are available at https://github.com/ExplainableML/sae-for-vlm.

large language model, machine learning, natural language, (18 more...)

2504.02821

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Spreafico, Matteo, Tassini, Ludovica, Sancricca, Camilla, Cappiello, Cinzia

Lost in the Pipeline: How Well Do Large Language Models Handle Data Preparation?

arXiv.org Artificial IntelligenceDec-1-2025

Large language models have recently demonstrated their exceptional capabilities in supporting and automating various tasks. Among the tasks worth exploring for testing large language model capabilities, we considered data preparation, a critical yet often labor-intensive step in data-driven processes. This paper investigates whether large language models can effectively support users in selecting and automating data preparation tasks. To this aim, we considered both general-purpose and fine-tuned tabular large language models. We prompted these models with poor-quality datasets and measured their ability to perform tasks such as data profiling and cleaning. We also compare the support provided by large language models with that offered by traditional data preparation tools. To evaluate the capabilities of large language models, we developed a custom-designed quality model that has been validated through a user study to gain insights into practitioners' expectations.

data mining, large language model, machine learning, (20 more...)

2511.21708

Country:

North America > United States (0.04)
Europe > Italy > Lombardy > Milan (0.04)
Asia > India > Karnataka > Bengaluru (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

arXiv.org Artificial IntelligenceDec-1-2025

Is Passive Expertise-Based Personalization Enough? A Case Study in AI-Assisted Test-Taking

Siyan, Li, Zhang, Jason, Maharaj, Akash, Shi, Yuanming, Li, Yunyao

Novice and expert users have different systematic preferences in task-oriented dialogues. However, whether catering to these preferences actually improves user experience and task performance remains understudied. To investigate the effects of expertise-based personalization, we first built a version of an enterprise AI assistant with passive personalization. We then conducted a user study where participants completed timed exams, aided by the two versions of the AI assistant. Preliminary results indicate that passive personalization helps reduce task load and improve assistant perception, but reveal task-specific limitations that can be addressed through providing more user agency. These findings underscore the importance of combining active and passive personalization to optimize user experience and effectiveness in enterprise task-oriented environments.

expertise, large language model, machine learning, (24 more...)

2511.23376

Country:

North America > United States > Florida > Miami-Dade County > Miami (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.04)

Genre:

Research Report > New Finding (1.00)
Questionnaire & Opinion Survey (1.00)
Research Report > Experimental Study (0.93)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Human Computer Interaction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.93)
(3 more...)

arXiv.org Artificial IntelligenceNov-25-2025

Towards Automating Data Access Permissions in AI Agents

Wu, Yuhao, Yang, Ke, Roesner, Franziska, Kohno, Tadayoshi, Zhang, Ning, Iqbal, Umar

As AI agents attempt to autonomously act on users' behalf, they raise transparency and control issues. We argue that permission-based access control is indispensable in providing meaningful control to the users, but conventional permission models are inadequate for the automated agentic execution paradigm. We therefore propose automated permission management for AI agents. Our key idea is to conduct a user study to identify the factors influencing users' permission decisions and to encode these factors into an ML-based permission management assistant capable of predicting users' future decisions. We find that participants' permission decisions are influenced by communication context but importantly individual preferences tend to remain consistent within contexts, and align with those of other participants. Leveraging these insights, we develop a permission prediction model achieving 85.1% accuracy overall and 94.4% for high-confidence predictions. We find that even without using permission history, our model achieves an accuracy of 66.9%, and a slight increase of training samples (i.e., 1-4) can substantially increase the accuracy by 10.8%.

artificial intelligence, machine learning, participant, (18 more...)